A Review of Off-Policy Evaluation in Reinforcement Learning
Reinforcement learning (RL) is one of the most vibrant research frontiers in
machine learning and has recently been applied to a number of challenging
problems. In this paper, we primarily focus on off-policy evaluation (OPE), one
of the most fundamental topics in RL. In recent years, a number of OPE methods
have been developed in the statistics and computer science literature. We
discuss the efficiency bound of OPE, several existing state-of-the-art OPE
methods, their statistical properties, and related research directions that
are under active exploration.
Comment: Still under revisio